1 Load the data

For now we look only at descriptor usage; later we may add applicability or combine the two measures.

atlasUsage <- load_data("dravnieks_1985/behavior_2.csv") %>% 
  separate(Stimulus,c("Stimulus","Concentration"),sep="_")
## No local file: Attempting remote file access
#atlasApplicability <- load_data("dravnieks_1985/behavior_1.csv")
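
To make the `separate()` step concrete, here is a minimal sketch on a toy table. The stimulus names, the `high`/`low` suffixes, and the descriptor columns are hypothetical stand-ins, not values from the atlas. The only assumption taken from the code above is that the name and the concentration are joined by an underscore.

```r
library(tidyr)

# Hypothetical stand-in for the atlas table: each stimulus name carries a
# concentration suffix after an underscore.
toy <- data.frame(
  Stimulus = c("OdorA_high", "OdorA_low", "OdorB_high"),
  Fruity   = c(40, 35, 2),
  Sweaty   = c(1, 0, 55)
)

# Split the combined label into two columns at the underscore.
toy_split <- separate(toy, Stimulus, c("Stimulus", "Concentration"), sep = "_")
print(toy_split)
```

After the split, the first two columns hold the labels and the remaining columns hold the descriptor ratings, which is why later steps drop columns 1 and 2 before running numeric analyses.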

2 Visualize the odors with PCA

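# PCA on the descriptor columns only (columns 1-2 are Stimulus and Concentration), scaled to unit variance
pca1 <- prcomp(atlasUsage[, -c(1:2)], scale. = TRUE)
scores <- as.data.frame(pca1$x)
ggplot(data = scores, aes(x = PC1, y = PC2, label = atlasUsage$Stimulus)) +
  geom_point() +
  geom_text_repel()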
## Warning: ggrepel: 121 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

If you are familiar with the odor names, you can see that the lower left of the graph holds unpleasant molecules: sulfurous molecules like methyl thiobutyrate smell like rotten eggs, and carboxylic acids like butanoic acid smell like sweaty feet. The lower right holds fruity molecules like Aldehyde-C16, which smells of strawberries. The atlas includes both a high and a low concentration of Aldehyde-C16, and the two sit close together on this plot, indicating that both concentrations smell relatively similar.
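
How far to trust closeness on a two-dimensional PCA plot depends on how much variance the first two components capture. The sketch below shows one way to check this, and to put a number on "close", using base R only. It runs on the built-in `USArrests` table as a stand-in, not on the odor atlas. The names `pca_demo`, `scores_demo`, and `pc_dist` are introduced here for illustration.

```r
# Stand-in data: the built-in USArrests table, NOT the odor atlas.
pca_demo <- prcomp(USArrests, scale. = TRUE)

# Proportion of total variance captured by each principal component.
var_explained <- pca_demo$sdev^2 / sum(pca_demo$sdev^2)
print(round(var_explained[1:2], 3))

# Euclidean distance between two rows in the PC1-PC2 plane.
scores_demo <- pca_demo$x[, 1:2]
pc_dist <- function(a, b) {
  sqrt(sum((scores_demo[a, ] - scores_demo[b, ])^2))
}

print(pc_dist("Alabama", "Georgia"))
print(pc_dist("Alabama", "Vermont"))
```

The same pattern should carry over to `pca1` and `scores` above, with rows selected by index or by matching `atlasUsage$Stimulus`, for example to compare the two Aldehyde-C16 concentrations.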

3 Visualize the odors with an interactive PCA plot

Now we will make an interactive plot: hover over a point to see its odor label.

p <- ggplot(data=scores, aes(x=PC1, y=PC2, label=atlasUsage$Stimulus))+geom_point()
ggplotly(p)
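
By default the hover box lists every mapped aesthetic. One possible refinement, sketched here on a hypothetical three-row table rather than the real `scores`, is to map the odor name to the `text` aesthetic and ask `ggplotly()` to show only that.

```r
library(ggplot2)
library(plotly)

# Hypothetical scores table standing in for `scores` plus the odor names.
demo <- data.frame(
  PC1  = c(-2.1, 0.4, 1.8),
  PC2  = c(-1.0, 1.5, -0.6),
  Odor = c("OdorA", "OdorB", "OdorC")
)

# Map the name to the `text` aesthetic so it is available to the tooltip.
p_demo <- ggplot(demo, aes(x = PC1, y = PC2, text = Odor)) +
  geom_point()

# Restrict the tooltip to the `text` aesthetic only.
fig <- ggplotly(p_demo, tooltip = "text")
fig
```

Whether the shorter tooltip is preferable is a matter of taste. Keeping the default also shows the PC coordinates, which can be useful when comparing points.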

4 Visualize individual odors

Spider plots are a great way to look at these data. Here I use the fmsb package, which expects a specific layout: the first two rows of the data frame must give the maximum (100) and minimum (0) possible value of each descriptor, so I prepend those two rows.

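# Prepend the max (100) and min (0) rows that radarchart() expects when maxmin=TRUE
df.spider <- rbind(rep(100, ncol(atlasUsage)), rep(0, ncol(atlasUsage)), atlasUsage)
rownames(df.spider) <- df.spider$Stimulus
# Drop the Stimulus and Concentration columns, leaving only descriptor ratings
df.spider <- df.spider[, -c(1, 2)]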
#Set the odor to look at
odorNum=which(rownames(df.spider)=="ButanoicAcid")
#Drop descriptors with low usage to reduce clutter
over10 <- df.spider[odorNum,] > 10
over10[1]=FALSE #remove the first column, which is odor label
radarchart(df.spider[c(1,2,odorNum),over10],maxmin=TRUE,vlcex=0.3,title=rownames(df.spider)[odorNum])
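
The layout and the filtering step are easier to follow on a small example. The sketch below uses hypothetical ratings for two made-up odors; none of the values come from the atlas. It builds the max and min rows, keeps only descriptors rated above 10 for the chosen odor, and draws the chart only if fmsb is installed. Note that `radarchart()` needs at least three descriptors to survive the filter.

```r
# Hypothetical ratings for two odors on five descriptors (0-100 scale).
ratings <- data.frame(
  Fruity = c(60, 2),
  Sweet  = c(45, 15),
  Sour   = c(8, 30),
  Sweaty = c(1, 70),
  Floral = c(30, 4),
  row.names = c("OdorA", "OdorB")
)

# With maxmin = TRUE, radarchart() reads row 1 as the axis maximum and
# row 2 as the axis minimum; the data rows follow.
limits <- rbind(rep(100, ncol(ratings)), rep(0, ncol(ratings)))
colnames(limits) <- colnames(ratings)
rownames(limits) <- c("max", "min")
spider_demo <- rbind(as.data.frame(limits), ratings)

# Keep only the descriptors rated above 10 for the chosen odor.
odor <- "OdorB"
keep <- as.vector(spider_demo[odor, ] > 10)

# Draw the chart only when fmsb is available.
if (requireNamespace("fmsb", quietly = TRUE)) {
  fmsb::radarchart(spider_demo[c("max", "min", odor), keep],
                   maxmin = TRUE, title = odor)
}
```

Selecting rows by name rather than by position is a design choice made for this sketch; it avoids depending on where the odor happens to sit in the table.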

Here is one more:

odorNum=which(rownames(df.spider)=="MuskGalaxolide")
#Drop descriptors with low usage to reduce clutter
over10 <- df.spider[odorNum,] > 10
over10[1]=FALSE #remove the first column, which is odor label
radarchart(df.spider[c(1,2,odorNum),over10],maxmin=TRUE,vlcex=0.3,title=rownames(df.spider)[odorNum])